Abstract


In this paper we discuss a Gaussian random field that arises in pattern analysis. This random field exhibits
phase transition behavior for a particular value of the temperature parameter. We analyze this kind of
non-singular behavior and the effect that it has on the field random variables. The limiting specific heat
also exhibits a phase transition with a power law behavior.


Section  1.  Introduction 


One of the principal aims of statistical mechanics is to derive the thermodynamic behavior
of macroscopic bodies beginning from a description of their microscopic components. A
good deal of work has been done on modelling ferromagnetic and antiferromagnetic behavior.
A magnet can be considered to have a large number of magnetic domains, to each of
which a magnetic spin is associated that represents the direction of magnetization at that
domain. We usually assume that the spins take two values, 0 and 1. The physical models
usually postulate that these domains are sites (or vertices) in a graph.

An undirected graph $\mathcal{G} = (\Lambda, e)$ consists of a set of vertices, $\Lambda$, and an edge set, $e$. The
elements of $e$ are unordered pairs $(x, y)$, $x, y \in \Lambda$; when $(x, y) \in e$ we say that there is
an edge of the graph between $x$ and $y$, or that $x$ and $y$ are neighbors. We shall assume that
$(x, x) \notin e$, i.e. the graph has no loops. As an example, consider a 4 neighbor $n \times n$ lattice
graph in the plane. A vertex of this graph is an ordered pair $(i, j)$, $0 \le i, j \le n - 1$. The
edge set is defined as follows: each point $(i, j)$ has four neighbors, $(i - 1, j)$, $(i + 1, j)$, $(i, j - 1)$,
$(i, j + 1)$, where $i - 1$, $i + 1$ etc. are calculated modulo $n$. Thus $(n, n)$ is identified with 0,
and this graph is actually a torus. We shall be seeing this graph again in Section 3 of this
essay. In general, the way we define the graph neighborhood structure is dictated by our
knowledge of the influence of different sites on each other.
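As an illustration of the torus example above, the following Python sketch lists the four neighbors of a site and builds the edge set as unordered pairs. The function names `torus_neighbors` and `torus_edges` are ours and are not part of the original text; the sketch assumes $n \ge 3$ so that the four neighbors of a site are distinct.

```python
def torus_neighbors(i, j, n):
    """Four neighbors of site (i, j) on the n x n lattice, indices taken modulo n."""
    return [((i - 1) % n, j), ((i + 1) % n, j), (i, (j - 1) % n), (i, (j + 1) % n)]


def torus_edges(n):
    """Edge set of the torus graph as a set of unordered pairs (frozensets) of sites."""
    edges = set()
    for i in range(n):
        for j in range(n):
            for neighbor in torus_neighbors(i, j, n):
                # frozenset makes the pair unordered, so each edge is stored once
                edges.add(frozenset({(i, j), neighbor}))
    return edges
```

Every site has exactly four neighbors, so for $n \ge 3$ the graph has $2n^2$ edges; the wrap-around at the boundary is what makes the lattice a torus.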




We have spins at every site in the graph, and a probability model is fully specified as soon
as we put a joint distribution on these spins. These models are supposed to tell us which
configurations of spins are more likely than the others. Physical models usually assume
that the joint distribution of the spins is a Markov random field. A Markov random field
is a probability distribution on the set of spins in the graph for which the conditional
distribution of spins on a set $A$, given all other spins in the graph, equals the conditional
distribution of the spins on $A$, given the spins immediately bordering $A$. Markov random
fields are identical to the so called Gibbs distributions with nearest neighbor potentials,
provided that every $A$ is given positive probability. Presently, we shall give a precise
definition of Gibbs distributions with nearest neighbor potentials. Preston [1] has a more
complete discussion of Markov random fields and Gibbs states.

A nearest neighbor Gibbs distribution is defined as follows:

Let $\mathcal{G} = (\Lambda, e)$ be the finite graph with vertex set $\Lambda$ and edge set $e$. A set $B \subset \Lambda$ is called a
simplex of $\mathcal{G}$ if for all $x, y \in B$, $x \ne y$, there is an edge between $x$ and $y$ in the graph
$\mathcal{G}$. Simplices are also sometimes referred to as cliques.

Let $J$ be a real valued function defined on subsets of $\Lambda$ such that $J(\emptyset) = 0$. The function
$J(\cdot)$ is called a potential. Let $A$ be a non-empty subset of $\Lambda$. Define the probability of $A$,

$$\pi(A) = Z^{-1} \exp\Big( \sum_{B \subseteq A,\ B \text{ a simplex of } \mathcal{G}} J(B) \Big). \qquad (1.1)$$


This is the probability that all sites in $A$ have spin 1 and the rest have spin 0. We could
generalize this to let the spins assume arbitrary real (or complex) values. We shall consider
these kinds of distributions below (see (1.2)).
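To make the definition (1.1) concrete, here is a minimal Python sketch that enumerates all subsets of a small vertex set and normalizes the weights. The names `is_simplex` and `gibbs_distribution`, and the way the potential is passed as a function `J`, are our own illustrative choices. We also assume that the empty configuration (all spins 0) receives weight $\exp(0) = 1$, which the text does not state explicitly.

```python
from itertools import combinations
from math import exp


def is_simplex(B, edges):
    """True if every pair of distinct vertices in B is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(B, 2))


def gibbs_distribution(vertices, edges, J):
    """Return {A: pi(A)} over all subsets A of the vertex set, in the form of (1.1).

    Assumption: the empty set is included with weight exp(0) = 1.
    """
    subsets = [frozenset(c) for r in range(len(vertices) + 1)
               for c in combinations(vertices, r)]
    weights = {}
    for A in subsets:
        # Sum the potential over the non-empty simplices B contained in A
        energy = sum(J(B) for B in subsets if B and B <= A and is_simplex(B, edges))
        weights[A] = exp(energy)
    Z = sum(weights.values())  # normalizing constant
    return {A: w / Z for A, w in weights.items()}
```

For example, on the path graph with vertices 0, 1, 2 and edges {0,1}, {1,2}, a (hypothetical) potential equal to 0.5 on two-point simplices and 0 elsewhere makes the configuration {0,1} more likely than {0,2} by a factor of $e^{0.5}$, because {0,2} is not a simplex.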

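Definition 1.1

Let

$$L_n^d = \{(i_1, \ldots, i_d) : 0 \le i_k \le n - 1,\ k = 1, \ldots, d\}.$$

For

$$i = (i_1, \ldots, i_d) \in L_n^d \quad \text{and} \quad j = (j_1, \ldots, j_d) \in L_n^d$$

let $i \oplus j = (i_1 \oplus j_1, \ldots, i_d \oplus j_d)$, where $\oplus$ is addition modulo $n$.

Note. For the rest of this essay, $i$, $j$, $k$, $l$ will denote $d$ dimensional vectors in $L_n^d$.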

We shall consider graphs with vertex set $L_n^d$. The edge set will be defined as follows. First
specify a neighborhood $N_0$ of 0, and then define a neighborhood of $i \in L_n^d$ by $i \oplus j$, $j \in N_0$.
These graphs are said to be isotropic, i.e. the neighborhood structure is the same for
all vertices. In fact, the example that we had introduced earlier was just a special case
with $d = 2$ and $N_0 = \{(0, 1), (1, 0), (0, -1), (-1, 0)\}$. $N_0$ is usually taken to be a symmetric
neighborhood of 0. Let $\{X_i^{(n)}, i \in L_n^d\}$ be a collection of random variables on the lattice $L_n^d$
with joint distribution defined by

$$Z_n(T)^{-1} \exp(-H(x) T^{-1}) \times \prod_{i \in L_n^d} Q(x_i), \qquad (1.2)$$

where $Q(\cdot)$ is a density on $\mathbb{R}$. $Z_n(T)$ is called the partition function. The function
$H : \mathbb{R}^{n^d} \to \mathbb{R}$ is called the Hamiltonian. We can look at two different types of Hamiltonians,
ferromagnetic and antiferromagnetic. Ferromagnetic Hamiltonians increase as neighboring
spins become more alike, and antiferromagnetic Hamiltonians increase as neighboring spins
become less alike. For instance, if the Hamiltonian in (1.2) is given by

$$H(x) = \sum_{i \in L_n^d} \sum_{j \in N_0} c_j (x_i - x_{i+j})^2, \qquad (1.3)$$

then it is ferromagnetic if the $c_j$'s are positive, and is antiferromagnetic if the $c_j$'s are
negative.
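The Hamiltonian (1.3) is straightforward to evaluate on a two-dimensional torus. The sketch below is ours, not the authors'; the function name `hamiltonian` and the choice to pass the coefficients as a dictionary mapping each offset $j \in N_0$ to $c_j$ are assumptions made for illustration. Periodic shifts are done with `numpy.roll`, which implements the modulo-$n$ addition of Definition 1.1.

```python
import numpy as np


def hamiltonian(x, c):
    """H(x) = sum_i sum_{j in N0} c_j (x_i - x_{i+j})^2 on an n x n torus.

    x : 2-d array of spin values, one per lattice site.
    c : dict mapping each offset j = (j1, j2) in N0 to its coefficient c_j.
    """
    total = 0.0
    for (j1, j2), cj in c.items():
        # np.roll with shift -j1 along axis 0 places x[(i + j1) % n] at index i,
        # so `shifted` holds x_{i+j} at site i
        shifted = np.roll(x, shift=(-j1, -j2), axis=(0, 1))
        total += cj * np.sum((x - shifted) ** 2)
    return float(total)
```

A constant field gives $H = 0$, while on a $4 \times 4$ lattice a checkerboard of 0s and 1s with all $c_j = 1$ gives $H = 64$: each of the 16 sites differs from each of its four neighbors by 1.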

Section  2.  Summary  of  results. 


Our study is based on an unpublished manuscript of Grenander and Sethuraman [2]. They
defined a class of Gaussian Markov random fields on the lattice $L_n^d$. These fields were
trigonometrically interpolated to $[0, 1]^d$ and the convergence of these processes was studied
in some simple cases. Grenander and Sethuraman [2] studied probability distributions like
those specified in (1.2) with Hamiltonians given by (1.3). The $c_j$ were allowed to depend
on $n$. However, we shall assume that the $c_j$ are fixed. Our fields are also Gaussian Markov
random fields. A parameter $T$ is present which will play the role of temperature in one of
our models. We obtained the following results:

1. The variance of the field variables grows faster for $T = 16$ than for $T > 16$: it stays bounded as $n \to \infty$ when $T > 16$ (Theorem 3.1) and grows like $\log n$ when $T = 16$ (Theorem 3.2).

2. The limiting specific heat diverges as $T \to 16$.

3. This divergence takes place at a rate proportional to $(T - 16)^{-1}$. This is called a
power law behavior at $T = 16$.

4. The sum of squares of the field random variables satisfies a different central limit law
for $T = 16$ as compared to $T > 16$.


Section  3.  A  Gaussian  model. 


We shall define a Gaussian model as shown below. Let $\{Y_i^{(n)}, i \in L_n^2\}$ be a collection of random
variables indexed by the lattice $L_n^2$ with a joint p.d.f. proportional to

$$\exp\Big( -\frac{1}{2} \sum_{i \in L_n^2} y_i^2 + \sum_{i \in L_n^2} \frac{\sum_{j \in N_0} c_j (y_i - y_{i+j})^2}{2T} \Big). \qquad (3.1)$$

$N_0 = M_0 \cup -M_0$ is a symmetric neighbourhood of 0. Here the $c_j$ represent the interaction
between $y_i$ and $y_{i+j}$, are independent of $n$ and reflect the isotropic structure of the graph.
$T$ is the usual temperature parameter that we saw in (1.2).


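This joint density can be put in the form

$$Z_n^{-1} \exp\Big( -\frac{1}{2} y' A y \Big),$$

where $A$ is of the form

$$A_{i,i} = A_{0,0} = 1 - 2T^{-1} \sum_{j \in N_0} c_j \quad \forall\, i \in L_n^2,$$
$$A_{i,i+j} = 0 \ \text{ if } j \notin N_0 \cup \{0\}, \qquad (3.2)$$
$$A_{i,i+j} = A_{0,j} = \frac{c_j + c_{-j}}{T}, \quad j \in N_0, \ \forall\, i \in L_n^2.$$

$A$ is a circulant matrix, and has eigenvectors

$$e_k = (\exp(i 2\pi \langle k, j \rangle / n),\ j \in L_n^2), \quad k \in L_n^2.$$

The eigenvalues of $A$, $\lambda_k$, are given by

$$\lambda_k = \sum_{j \in L_n^2} A_{0,j} \exp(i 2\pi \langle k, j \rangle / n)$$
$$= A_{0,0} + \sum_{j \in N_0} A_{0,j} \exp(i 2\pi \langle k, j \rangle / n)$$
$$= A_{0,0} + 2T^{-1} \sum_{j \in M_0} (c_j + c_{-j}) \cos(2\pi \langle k, j \rangle / n)$$
$$= 1 - 2T^{-1} \sum_{j \in M_0} (c_j + c_{-j}) [1 - \cos(2\pi \langle k, j \rangle / n)] \quad \text{(by (3.2))}. \qquad (3.3)$$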


The eigenvalues and eigenvectors depend on $n$, which for reasons of clarity has not been
included in the notation.

We will now specialize this model to the four neighbor lattice graph that we had defined
in Section 1; that is, we will assume that $M_0 = \{(1, 0), (0, 1)\}$. We will also assume that
all the $c_j = 1$, and that $n$ is odd. Since the $c_j$ are positive, the model is antiferromagnetic.


Now by (3.3), the eigenvalues of $A$ are given by

$$\lambda_k = 1 - \frac{4}{T} \Big[ 1 - \cos\Big(\frac{2\pi k_1}{n}\Big) + 1 - \cos\Big(\frac{2\pi k_2}{n}\Big) \Big]. \qquad (3.4)$$
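The closed form (3.4) can be checked numerically on a small lattice. The Python sketch below is an illustration of ours, not part of the original argument. It takes $A$ to have diagonal entries $1 - 8/T$ and entries $2/T$ between neighboring sites, which is what (3.2) gives when all $c_j = 1$ on the four neighbor torus; the function names are hypothetical, and $n \ge 3$ is assumed so that the four neighbors are distinct.

```python
import numpy as np


def precision_matrix(n, T):
    """Matrix A for the four neighbor torus with all c_j = 1.

    Assumption: diagonal 1 - 8/T, neighbor entries 2/T, zero elsewhere.
    """
    A = np.zeros((n * n, n * n))

    def idx(i, j):
        # Flatten the site (i, j), with wrap-around, to a row/column index
        return (i % n) * n + (j % n)

    for i in range(n):
        for j in range(n):
            A[idx(i, j), idx(i, j)] = 1.0 - 8.0 / T
            for di, dj in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
                A[idx(i, j), idx(i + di, j + dj)] = 2.0 / T
    return A


def eigenvalues_formula(n, T):
    """Eigenvalues lambda_k from (3.4), returned as an n x n array indexed by (k1, k2)."""
    k = np.arange(n)
    c = 1.0 - np.cos(2.0 * np.pi * k / n)
    return 1.0 - 4.0 / T * (c[:, None] + c[None, :])
```

Comparing `np.sort(np.linalg.eigvalsh(precision_matrix(5, 20.0)))` with the sorted values of `eigenvalues_formula(5, 20.0)` gives a direct check that the circulant structure produces the cosine formula.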


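Since $A$ is the inverse of the variance-covariance matrix of the $Y$'s, we have that

$$\mathrm{Var}(Y_i^{(n)}) = \mathrm{Var}(Y_0^{(n)}) = n^{-2} \sum_{k \in L_n^2} \frac{1}{\lambda_k}. \qquad (3.5)$$

Notice that the $Y$'s have a legitimate p.d.f. if $T \ge 16$ since all the eigenvalues of $A$ are
positive, by (3.4): because $n$ is odd, neither cosine in (3.4) can equal $-1$, so the bracket is strictly
less than 4. The two theorems below will study the rate of growth of the variance
of the $Y$'s.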


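Theorem 3.1

Suppose that $T > 16$. Then, for each $i \in L_n^2$,

$$\mathrm{Var}(Y_i^{(n)}) \to \int_0^1 \int_0^1 (1 - 4T^{-1}(1 - \cos(2\pi x) + 1 - \cos(2\pi y)))^{-1}\, dx\, dy \quad \text{as } n \to \infty. \qquad (3.6)$$

Proof. By (3.5),

$$\mathrm{Var}(Y_i^{(n)}) = n^{-2} \sum_{k \in L_n^2} \Big[ 1 - \frac{4}{T} [1 - \cos(2\pi k_1/n) + 1 - \cos(2\pi k_2/n)] \Big]^{-1}.$$

The right-hand side is a Riemann sum for the integral in (3.6), whose integrand is continuous on $[0, 1]^2$ when $T > 16$. The rest is trivial. □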
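As a numerical illustration of Theorem 3.1 (ours, not the authors'), the lattice sum in (3.5) can be evaluated directly for $T > 16$ and seen to stabilize as $n$ grows. The function name `lattice_variance` is hypothetical.

```python
import numpy as np


def lattice_variance(n, T):
    """n^{-2} * sum_k 1 / lambda_k, the right-hand side of (3.5), with lambda_k from (3.4)."""
    k = np.arange(n)
    c = 1.0 - np.cos(2.0 * np.pi * k / n)
    lam = 1.0 - 4.0 / T * (c[:, None] + c[None, :])
    # The mean over the n^2 lattice points is the Riemann sum for the integral in (3.6)
    return float(np.mean(1.0 / lam))
```

Evaluating `lattice_variance(51, 20.0)` and `lattice_variance(201, 20.0)` gives values that agree to many digits, which is what one expects from a Riemann sum of a smooth periodic integrand.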

Remark. When $T = 16$ the integral in (3.6) diverges. However, with a different normalisation,
the variance of the normalised $Y$'s converges to a finite constant, as studied in
Theorem 3.2 below.

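Theorem 3.2
Let $T = 16$. Then

$$\mathrm{Var}((\sqrt{\log n})^{-1} Y_i^{(n)}) \to 4/\pi \quad \text{as } n \to \infty.$$

Proof. By (3.5), we can write

$$V_n = \mathrm{Var}((\sqrt{\log n})^{-1} Y_i^{(n)}) = (n^2 \log n)^{-1} \sum_{k \in L_n^2} \frac{4}{1 + \cos(2\pi k_1/n) + 1 + \cos(2\pi k_2/n)}. \qquad (3.8)$$

Let $0 < \epsilon < 1$ and let $0 < \delta < 1/2$ be such that

$$2\pi^2 x^2 (1 - \epsilon) < 1 - \cos(2\pi x) < 2\pi^2 x^2 (1 + \epsilon) \quad \text{if } 0 < |x| < \delta. \qquad (3.9)$$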
Let $l_1 = k_1 - (n - 1)/2$, $l_2 = k_2 - (n - 1)/2$, and $a(l_1, l_2) = [1 - \cos(2\pi l_1/n) + 1 - \cos(2\pi l_2/n)]$.
Then $V_n$ can be written as

$$V_n = 4 (n^2 \log n)^{-1} \sum \frac{1}{[1 - \cos(2\pi l_1/n) + 1 - \cos(2\pi l_2/n)]}$$
$$= 4 (n^2 \log n)^{-1} \sum_{|l_1| > n\delta \text{ or } |l_2| > n\delta} \frac{1}{a(l_1, l_2)} + 4 (n^2 \log n)^{-1} \sum_{0 < |l_1| < n\delta,\ 0 < |l_2| < n\delta} \frac{1}{a(l_1, l_2)} \qquad (3.10)$$
$$= V_{n,1} + V_{n,2}, \text{ respectively.}$$


Now observe that $a(l_1, l_2) > (1 - \cos(2\pi\delta))$ in the region $\{|l_1| > n\delta \text{ or } |l_2| > n\delta\}$. Hence,
$V_{n,1} < 4 (\log n)^{-1} (1 - \cos(2\pi\delta))^{-1} = o(1)$ as $n \to \infty$.

Then by (3.9) we have that

$$2\pi^2 (1 - \epsilon)(l_1^2 + l_2^2) < n^2 a(l_1, l_2) < 2\pi^2 (1 + \epsilon)(l_1^2 + l_2^2), \qquad (3.11)$$

if $|l_1| < n\delta$, $|l_2| < n\delta$.


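By (3.11),

$$4 (2\pi^2 (1 + \epsilon) \log n)^{-1} \sum_{0 < |l_1| < n\delta,\ 0 < |l_2| < n\delta} \frac{1}{l_1^2 + l_2^2} < V_{n,2} < 4 (2\pi^2 (1 - \epsilon) \log n)^{-1} \sum_{0 < |l_1| < n\delta,\ 0 < |l_2| < n\delta} \frac{1}{l_1^2 + l_2^2}. \qquad (3.12)$$

From Lemma 3.1 (below) and (3.12), and since $\epsilon$ is arbitrary, it follows that $V_{n,2} \to 4/\pi$ as $n \to \infty$. Hence, by
(3.10) we know that $V_n$ converges to $4/\pi$ as $n \to \infty$. This completes the proof of the
theorem. □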
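The logarithmic growth in Theorem 3.2 can also be observed numerically. The sketch below is our own illustration; the function name `variance_at_critical` is hypothetical. It evaluates the sum in (3.8) without the $(\log n)^{-1}$ factor. Since the theorem gives growth like $(4/\pi) \log n$ plus lower order terms, tripling $n$ should increase the value by roughly $(4/\pi) \log 3$.

```python
import numpy as np


def variance_at_critical(n):
    """n^{-2} * sum_k 1 / lambda_k at T = 16, i.e. Var(Y_i^{(n)}) from (3.5).

    At T = 16, 1 / lambda_k = 4 / (1 + cos(2 pi k1 / n) + 1 + cos(2 pi k2 / n)).
    """
    if n % 2 == 0:
        # For even n the eigenvalue at k1 = k2 = n/2 is zero and the sum is undefined
        raise ValueError("n must be odd so that every eigenvalue is positive")
    k = np.arange(n)
    c = 1.0 + np.cos(2.0 * np.pi * k / n)
    return float(np.mean(4.0 / (c[:, None] + c[None, :])))
```

Looking at the difference `variance_at_critical(3 * n) - variance_at_critical(n)` rather than the ratio to $\log n$ removes the additive constant, which otherwise makes the convergence to $4/\pi$ look very slow.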


Lemma 3.1

Let

$$K_n = (\log n)^{-1} \sum_{0 < |l_1| < n,\ 0 < |l_2| < n} \frac{1}{l_1^2 + l_2^2}.$$

Then $K_n \to 2\pi$ as $n \to \infty$.

Proof. This lemma is proved by finding an upper and lower bound to $K_n$. Using the
inequality

$$\frac{1}{l_1^2 + l_2^2} \le \frac{1}{x^2 + y^2} \le \frac{1}{(l_1 - 1)^2 + (l_2 - 1)^2} \qquad (3.13)$$

for $(l_1 - 1, l_2 - 1) \le (x, y) \le (l_1, l_2)$, it follows that $K_n$ is bounded above by

$$(\log n)^{-1} \int\!\!\int_{1 < x^2 + y^2 < 2n^2} \frac{dx\, dy}{x^2 + y^2} \qquad (3.14)$$
$$= (\log n)^{-1} \int_0^{2\pi} \int_1^{n\sqrt{2}} \frac{dr\, d\theta}{r} \quad \text{(by a polar transformation)}$$
$$= 2\pi (1 + (\log n)^{-1} \log \sqrt{2}) = 2\pi + o(1).$$

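To obtain a lower bound, define $D_n = \{1 \le x \le n \text{ or } 1 \le y \le n, \text{ and } (x, y) \ne (1, 1)\}$. Then
by using (3.13) a lower bound on $K_n$ is obtained as follows:

$$K_n = 4 (\log n)^{-1} \sum_{0 < l_1 < n,\ 0 < l_2 < n} \frac{1}{l_1^2 + l_2^2}$$
$$= 4 (\log n)^{-1} \sum_{2 \le l_1 < n + 1,\ 2 \le l_2 < n + 1} \frac{1}{(l_1 - 1)^2 + (l_2 - 1)^2}$$
$$= 4 (\log n)^{-1} \Big( \sum_{2 \le l_1 < n + 1 \text{ or } 2 \le l_2 < n + 1,\ (l_1, l_2) \ne (2, 2)} \frac{1}{(l_1 - 1)^2 + (l_2 - 1)^2} + O(1) \Big) \qquad (3.15)$$
$$\ge 4 (\log n)^{-1} \Big( \int\!\!\int_{D_n} \frac{dx\, dy}{x^2 + y^2} + O(1) \Big)$$
$$\ge 4 (\log n)^{-1} \Big( \int\!\!\int_{1 < x^2 + y^2 < n^2,\ x, y > 0} \frac{dx\, dy}{x^2 + y^2} + O(1) \Big)$$
$$= 2\pi + o(1).$$

This completes the proof of Lemma 3.1. □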


We shall now study the behavior of the specific heat, which is defined by

$$G_n(T) = T \frac{\partial^2}{\partial T^2} \Big[ \frac{T \log Z_n}{n^2} \Big].$$

We shall show that $G_n(T) \to G(T)$, where $G(T)$ is called the limiting
specific heat. The limiting specific heat $G(T)$ is proportional to $(T - 16)^{-1}$ near $T = 16$.
This is called a power law behavior at $T = 16$ in the statistical mechanics literature.


Now note that $A$ is the inverse of the variance covariance matrix of the $Y$'s, and the determinant
of $A^{-1}$ is the product of the eigenvalues of $A^{-1}$. These eigenvalues are the reciprocals
of the eigenvalues of $A$, since $A$ is square symmetric. Hence the partition function $Z_n$ is
given by

$$Z_n = (2\pi)^{n^2/2} \times \prod_{k \in L_n^2} (\lambda_k)^{-1/2}.$$

For $T > 16$,

$$\log Z_n = \frac{n^2}{2} \log 2\pi - 2^{-1} \sum_{k \in L_n^2} \log(\lambda_k) = n^2 \Big( \frac{\log 2\pi}{2} - 2^{-1} n^{-2} \sum_{k \in L_n^2} \log(\lambda_k) \Big). \qquad (3.16)$$


We shall now calculate the specific heat

$$G_n(T) = T \frac{\partial^2}{\partial T^2} \Big( \frac{T \log Z_n}{n^2} \Big).$$

For brevity write $a(k_1, k_2) = 1 - \cos(2\pi k_1/n) + 1 - \cos(2\pi k_2/n)$, so that $\lambda_k = 1 - \frac{4}{T} a(k_1, k_2)$ by (3.4). By (3.16) we have

$$\frac{\partial}{\partial T} \Big( \frac{T \log Z_n}{n^2} \Big) = \frac{\log 2\pi}{2} - 2^{-1} n^{-2} \sum_{k \in L_n^2} \log(\lambda_k) - 2 (n^2 T)^{-1} \sum_{k \in L_n^2} \frac{a(k_1, k_2)}{1 - \frac{4}{T} a(k_1, k_2)}.$$

Differentiating again, we have

$$G_n(T) = T \frac{\partial^2}{\partial T^2} \Big( \frac{T \log Z_n}{n^2} \Big)$$
$$= T \Big[ -\frac{2}{T^2 n^2} \sum_{k \in L_n^2} \frac{a(k_1, k_2)}{1 - \frac{4}{T} a(k_1, k_2)} + \frac{2}{T^2 n^2} \sum_{k \in L_n^2} \frac{a(k_1, k_2)}{1 - \frac{4}{T} a(k_1, k_2)} + \frac{8}{T^3 n^2} \sum_{k \in L_n^2} \frac{a(k_1, k_2)^2}{[1 - \frac{4}{T} a(k_1, k_2)]^2} \Big].$$

The first two terms in the above expression will cancel out and leave us with

$$G_n(T) = \frac{8}{T^2 n^2} \sum_{k \in L_n^2} \frac{[1 - \cos(2\pi k_1/n) + 1 - \cos(2\pi k_2/n)]^2}{[1 - \frac{4}{T} [1 - \cos(2\pi k_1/n) + 1 - \cos(2\pi k_2/n)]]^2}.$$

This will converge to

$$G(T) = \frac{8}{T^2} \int_0^1 \int_0^1 \frac{[1 - \cos(2\pi x) + 1 - \cos(2\pi y)]^2}{[1 - 4T^{-1} [1 - \cos(2\pi x) + 1 - \cos(2\pi y)]]^2}\, dx\, dy \quad \text{as } n \to \infty. \qquad (3.17)$$

This integral is divergent at $T = 16$ and in fact, $G(T) \propto (T - 16)^{-1}$ for $T > 16$ as $T \to 16$.

Thus the specific heat for this model diverges.
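The growth of the specific heat near $T = 16$ can be seen by evaluating the finite-$n$ expression for $G_n(T)$ directly. The following Python sketch is our own illustration, with the hypothetical name `specific_heat`; it implements $G_n(T) = \frac{8}{T^2 n^2} \sum_k a_k^2 / (1 - 4 a_k / T)^2$ with $a_k = 1 - \cos(2\pi k_1/n) + 1 - \cos(2\pi k_2/n)$, and assumes $n$ odd and $T \ge 16$ so that every denominator is positive.

```python
import numpy as np


def specific_heat(n, T):
    """Finite-lattice specific heat G_n(T) for the four neighbor model with c_j = 1."""
    k = np.arange(n)
    c = 1.0 - np.cos(2.0 * np.pi * k / n)
    a = c[:, None] + c[None, :]      # a_k on the full n x n grid of (k1, k2)
    lam = 1.0 - 4.0 * a / T          # eigenvalues lambda_k from (3.4)
    # np.mean supplies the factor n^{-2} in front of the sum over k
    return float(8.0 / T**2 * np.mean(a**2 / lam**2))
```

For a fixed odd $n$, evaluating `specific_heat(n, T)` at a decreasing sequence such as $T = 20, 17, 16.5$ gives increasing values. To see the $(T - 16)^{-1}$ behavior of the limit $G(T)$, $n$ has to be taken large enough that the lattice resolves the region where $\lambda_k$ is close to $1 - 16/T$.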


Section  4. 


In this section, we shall study the behavior of the sum of squares of the random variables
$Y_i^{(n)}$. We shall show that this sum of squares obeys a different central limit theorem when
$T = 16$, as compared to $T > 16$. This result is due to the asymptotic
behavior of the eigenvalues $\lambda_k$ being different at $T = 16$.

Define

$$Q_n = \sum_{i \in L_n^2} (Y_i^{(n)})^2.$$

Then $Q_n = \sum_{k \in L_n^2} V_k / \lambda_k$, where the $V_k$ are i.i.d. random variables with a $\chi^2_1$ distribution.

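Theorem 4.1
If $T > 16$, then

$$\frac{Q_n - \sum_{k \in L_n^2} \lambda_k^{-1}}{\sqrt{2 \sum_{k \in L_n^2} \lambda_k^{-2}}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty.$$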


Proof. We shall check that Liapounov's conditions [4] for asymptotic normality hold.
Let $K_n = \mathrm{Var}(Q_n)$. Then $(Q_n - E(Q_n))/\sqrt{K_n} \xrightarrow{d} N(0, 1)$ as $n \to \infty$ if

$$K_n^{-2} \sum_{k \in L_n^2} E(V_k - 1)^4 / \lambda_k^4 \to 0 \quad \text{as } n \to \infty.$$

Since the $V_k$ are i.i.d., $E(V_k - 1)^4$ is a constant independent of $k$. The
terms $K_n$ and $\sum_{k \in L_n^2} E(V_k - 1)^4 / \lambda_k^4$ are both $O(n^2)$. Hence Liapounov's condition holds.

This proves the theorem. □

We  shall  now  study  the  behavior  of  Qn  when  T  =  16.  Theorem  4.2  is  a  central  limit 
theorem  for  Qn  when  T  =  16. 

Theorem  4.2 


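If $T = 16$, then

$$(Q_n - E(Q_n))/n^2 \xrightarrow{d} B \quad \text{as } n \to \infty,$$

where $B$ has an m.g.f. given by

$$\psi(t) = \exp\Big( \sum_{j=2}^{\infty} \frac{2^{j-1} t^j}{j} \cdot \frac{4^j (2\pi^2)^{-j} \pi}{j - 1} \Big).$$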


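Proof. We shall calculate the m.g.f. of $(Q_n - E(Q_n))/n^2$ and show that this m.g.f. converges
to the m.g.f. of $B$. This is sufficient to prove convergence in distribution of
$(Q_n - E(Q_n))/n^2$ to $B$.

Let $\psi_n(t)$ be the m.g.f. of $(Q_n - E(Q_n))/n^2$. Then,

$$\psi_n(t) = \frac{\exp(-t n^{-2} \sum_{k \in L_n^2} \lambda_k^{-1})}{\prod_{k \in L_n^2} (1 - 2t/(\lambda_k n^2))^{1/2}}.$$

Taking logarithms in the above,

$$\log(\psi_n(t)) = -t n^{-2} \sum_{k \in L_n^2} \lambda_k^{-1} - 2^{-1} \sum_{k \in L_n^2} \log(1 - 2t/(\lambda_k n^2)) = \sum_{j=2}^{\infty} \frac{2^{j-1} t^j}{j} \Big( n^{-2j} \sum_{k \in L_n^2} \lambda_k^{-j} \Big). \qquad (4.1)$$

By Lemma 4.1 the term

$$n^{-2j} \sum_{k \in L_n^2} \lambda_k^{-j} \to 4^j (2\pi^2)^{-j} \pi/(j - 1) \quad \text{as } n \to \infty.$$

Hence, by equation (4.1) and the above,

$$\psi_n(t) \to \exp\Big( \sum_{j=2}^{\infty} \frac{2^{j-1} t^j}{j} \cdot \frac{4^j (2\pi^2)^{-j} \pi}{j - 1} \Big).$$

This completes the proof of the theorem. □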

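Lemma 4.1 Let $T = 16$ and let $j \ge 2$ be an integer. Then

$$n^{-2j} \sum_{k \in L_n^2} \lambda_k^{-j} \to 4^j (2\pi^2)^{-j} \pi/(j - 1) \quad \text{as } n \to \infty.$$

Proof. The proof of this lemma is substantially the same as the proof of Theorem 3.2 and
so we will only sketch the proof. Write $a(l_1, l_2) = 1 - \cos(2\pi l_1/n) + 1 - \cos(2\pi l_2/n)$, with $l_1$, $l_2$ as in the proof of Theorem 3.2. Since $T = 16$,

$$n^{-2j} \sum_{k \in L_n^2} \lambda_k^{-j} = 4^j n^{-2j} \sum_{0 < |l_1| < n\delta,\ 0 < |l_2| < n\delta} \frac{1}{a(l_1, l_2)^j} + o(1). \qquad (4.2)$$

Let $\delta$ be as in the proof of Theorem 3.2. Then the right hand side of (4.2) is approximately
equal to

$$4^j (2\pi^2)^{-j} \sum_{0 < |l_1| < n\delta,\ 0 < |l_2| < n\delta} \frac{1}{(l_1^2 + l_2^2)^j} \approx 4^j (2\pi^2)^{-j} \int\!\!\int_{1 < x^2 + y^2 < 2n^2\delta^2} \frac{dx\, dy}{(x^2 + y^2)^j} \approx 4^j (2\pi^2)^{-j} \pi/(j - 1).$$

This completes the proof of the lemma. □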


Remark. The limit in distribution of $Q_n$ is different for $T = 16$ as compared to $T > 16$,
and the normalization constant for $Q_n$ at $T = 16$ is $n^2$ instead of $n$ when $T > 16$. This
suggests that the field random variables $Y_k$ vary much more for $T = 16$ than for $T > 16$.

However the critical behavior is at the endpoint of definition of the model and so it is
our opinion that this does not seriously restrict the application of the Gaussian model in
pattern analysis.
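The change of normalization between $T > 16$ and $T = 16$ can be illustrated through the variance of $Q_n$. The sketch below is ours and uses the hypothetical name `var_Q`; it relies on the representation $Q_n = \sum_k V_k/\lambda_k$ with independent $\chi^2_1$ variables, for which $\mathrm{Var}(Q_n) = 2 \sum_k \lambda_k^{-2}$, and assumes $n$ odd.

```python
import numpy as np


def var_Q(n, T):
    """Var(Q_n) = 2 * sum_k lambda_k^{-2}, using Var(chi^2_1) = 2 and independence of the V_k."""
    k = np.arange(n)
    c = 1.0 - np.cos(2.0 * np.pi * k / n)
    lam = 1.0 - 4.0 / T * (c[:, None] + c[None, :])   # eigenvalues from (3.4)
    return float(2.0 * np.sum(lam ** -2.0))
```

For $T > 16$ the ratio `var_Q(n, T) / n**2` settles down as $n$ grows, so the standard deviation of $Q_n$ is of order $n$. At $T = 16$ it is `var_Q(n, 16.0) / n**4` that settles down, so the standard deviation is of order $n^2$, matching the normalizations in Theorems 4.1 and 4.2.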